Load Balancing of Parallelized Information Filters
Abstract
We investigate the data-parallel implementation of a set of "information filters" used to rule out uninteresting data from a database or data stream. We develop an analytic model for the costs and advantages of load rebalancing the parallel filtering processes, as well as a quick heuristic for its desirability. Our model uses binomial models of the filter processes and fits key parameters to the results of extensive simulations. Experiments confirm our model. Rebalancing should pay off whenever processor communications costs are high. Further experiments showed it can also pay off even with low communications costs for 16-64 processes and 1-10 data items per processor; then, imbalances can increase processing time by up to 52 percent in representative cases, and rebalancing can increase it by 78 percent, so our quick predictive model can be valuable. Results also show that our proposed heuristic rebalancing criterion gives close to optimal balancing. We also extend our model to handle variations in filter processing time per data item.
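As a rough illustration only, and not the paper's analytic model, the kind of imbalance the abstract describes can be sketched with a small simulation. It assumes each data item passes a filter independently with a fixed probability, so the number of items surviving on each processor is binomially distributed. The function names, the pass probability of 0.5, and the max-over-mean imbalance measure are all hypothetical choices made for this sketch; communication costs, which the abstract identifies as central to the rebalancing decision, are not modeled.

```python
import random


def surviving_counts(num_procs, items_per_proc, pass_prob, rng):
    """Items left on each processor after one filter pass.

    Assumption for this sketch: every item passes independently with
    probability pass_prob, so each count is a binomial sample.
    """
    return [
        sum(1 for _ in range(items_per_proc) if rng.random() < pass_prob)
        for _ in range(num_procs)
    ]


def imbalance_ratio(counts):
    """Largest load divided by the mean load; 1.0 means perfectly even."""
    total = sum(counts)
    if total == 0:
        return 1.0  # nothing left to process, so nothing is imbalanced
    return max(counts) / (total / len(counts))


def rebalance(counts):
    """Spread the surviving items as evenly as possible (cost-free here)."""
    base, extra = divmod(sum(counts), len(counts))
    return [base + 1 if i < extra else base for i in range(len(counts))]


rng = random.Random(0)  # fixed seed so repeated runs give the same sample
counts = surviving_counts(num_procs=16, items_per_proc=10, pass_prob=0.5, rng=rng)
print("per-processor counts:", counts)
print("imbalance before:", imbalance_ratio(counts))
print("imbalance after: ", imbalance_ratio(rebalance(counts)))
```

With few items per processor, the slowest processor can hold noticeably more than the average, which is the regime (16-64 processes, 1-10 items per processor) that the abstract singles out. Whether rebalancing is worthwhile in practice depends on the communication cost that this sketch leaves out.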
Similar Resources
Load Balancing Using a Best-Path-Updating Information-Guided Ant Colony Optimization Algorithm
Abstract: Load balancing and phase balancing are important complements to reconfiguration of the feeder and the network. In distribution automation, these issues must be solved continuously and simultaneously to ensure the optimal performance of a distribution network. Distribution network imbalance has various consequences such as increase in power losses, voltage drop, cost increase, etc. In th...
Automatic selection of load balancing parameters using compile-time and run-time information
Clusters of workstations are emerging as an important architecture. Programming tools that aid in distributing applications on workstation clusters must address problems of mapping the application, heterogeneity and maximizing system utilization in the presence of varying resource availability. Both computation and communication capabilities may vary with time due to other applications competin...
An Algorithm for Dynamic Load Balancing of Synchronous Monte Carlo Simulations on Multiprocessor Systems
We describe an algorithm for dynamic load balancing of geometrically parallelized synchronous Monte Carlo simulations of physical models. This algorithm is designed for a (heterogeneous) multiprocessor system of the MIMD type with distributed memory. The algorithm is based on a dynamic partitioning of the domain of the algorithm, taking into account the actual processor resources of the various...
Load Balancing Approaches for Web Servers: A Survey of Recent Trends
Much work has been done on load balancing of web servers in grid environments. Grid environments are popular because they allow access to distributed resources located at remote locations. For effective utilization, load must be balanced among all resources. Importance of load balancing is discussed by distinguishing the system between without load balancing and with loa...
Handling Data Skew in Multiprocessor Database Computers Using Partition Tuning
Shared-nothing multiprocessor architecture is known to be more scalable to support very large databases. Compared to other join strategies, a hash-based join algorithm is particularly efficient and easily parallelized for this computation model. However, this hardware structure is very sensitive to the data skew problem. Unless the parallel hash join algorithm includes some load balancing mec...
Journal:
- IEEE Trans. Knowl. Data Eng.
Volume: 14, Issue:
Pages: -
Publication date: 2002